Using SVM to Extract Acronyms from Text
نویسندگان
چکیده
The paper addresses the problem of extracting acronyms and their expansions from text. We propose a support vector machines (SVM) based approach to deal with the problem. First, all likely acronyms are identified using heuristic rules. Second, expansion candidates are generated from surrounding text of acronyms. Last, SVM model is employed to select the genuine expansions. Analysis shows that the proposed approach has the advantages of saving over the conventional rule based approaches. Experimental results show that our approach outperforms the baseline method of using rules. We also show that the trained SVM model is generic and can adapt to other domains easily.
منابع مشابه
Using Compression to Identify Acronyms in Text
Text mining is about looking for patterns in natural language text, and may be de ned as the process of analyzing text to extract information from it for particular purposes. In previous work, we claimed that compression is a key technology for text mining, and backed this up with a study that showed how particular kinds of lexical tokens|names, dates, locations, etc.|can be identi ed and locat...
متن کاملA Term Recognition Approach to Acronym Recognition
We present a term recognition approach to extract acronyms and their definitions from a large text collection. Parenthetical expressions appearing in a text collection are identified as potential acronyms. Assuming terms appearing frequently in the proximity of an acronym to be the expanded forms (definitions) of the acronyms, we apply a term recognition method to enumerate such candidates and ...
متن کاملTranslation of Acronyms, Initialisms and Abbreviations (AIA) in Persian Political and Sport Journalistic Texts
The different writing systems of English and Persian makes translation of acronyms, initialisms and abbreviations challenging. This study aimed at finding which strategies were applied most frequently in translating acronyms, initialisms and abbreviations from English to Persian especially in journalistic texts. The study was done based n Descriptive Translation Study of Toury and strategies pr...
متن کاملEmotion Detection in Persian Text; A Machine Learning Model
This study aimed to develop a computational model for recognition of emotion in Persian text as a supervised machine learning problem. We considered Pluthchik emotion model as supervised learning criteria and Support Vector Machine (SVM) as baseline classifier. We also used NRC lexicon and contextual features as training data and components of the model. One hundred selected texts including pol...
متن کاملSVM Categorizer: A Generic Categorization Tool Using Support Vector Machines
Supervised text categorisation is a significant tool considering the vast amount of structured, unstru ctured, or semi-structured texts that are available from internal or external enterprise resources. The goal of supervised text categorisation is to assign text documents to finite pre -specified categories in order to extract and automatically organise information coming from th ese resources...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Soft Comput.
دوره 11 شماره
صفحات -
تاریخ انتشار 2007